Detecting Missing Hyphens in Learner Text

Authors

  • Aoife Cahill
  • Martin Chodorow
  • Susanne Wolff
  • Nitin Madnani
Abstract

We present a method for automatically detecting missing hyphens in English text. Our method goes beyond a purely dictionary-based approach and also takes context into account. We evaluate our model on artificially generated data as well as naturally occurring learner text. Our best-performing model achieves high precision and reasonable recall, making it suitable for inclusion in a system that gives feedback to language learners.

Similar resources

Si3Trenn and Si3Silb: Using the SiSiSi Word Analysis System for Pre-hyphenation and Syllable Counting in German Documents

We present two applications of a word analysis system for the German language: pre-hyphenation of documents in various formats, and counting the syllables of all words of a document. The Si3Trenn preprocessor provides pre-hyphenation for file formats allowing for soft hyphens (currently: plain text, LaTeX, RTF). It applies reliable, sense-conveying hyphenation (SiSiSi) to each word of the input t...

High-Order Sequence Modeling for Language Learner Error Detection

We address the problem of detecting English language learner errors by using a discriminative high-order sequence model. Unlike most work in error-detection, this method is agnostic as to specific error types, thus potentially allowing for higher recall across different error types. The approach integrates features from many sources into the error-detection model, ranging from language model-ba...

Introduction to the teachings of the transcendental paradigm in the process of teaching-learning and its critique

The purpose of this study is to examine and critique the teachings of the transcendental paradigm in the teaching-learning process. To this end, three methods (conceptual, inferential, and critical analysis) were used to analyze and critique the foreman paradigm. Findings of the research indicate that meta-text instead of oral text emphasizes written ...

On the Applicability of Oxford's Taxonomy of Learner Strategies to Translation Tasks

During the last three decades, especially the 1980s, language learning specialists have been busy discovering the nature of language learning strategies, describing them, and formulating their relationships with other language learning factors. In line with these studies, the field of translation studies has undergone a complete revolution in terms of its perspective toward its research prioritie...

Adapting a part-of-speech tagset to non-standard text: The case of STTS

The Stuttgart-Tübingen TagSet (STTS) is a de-facto standard for the part-of-speech tagging of German texts. Since its first publication in 1995, STTS has been used in a variety of annotation projects, some of which have adapted the tagset slightly for their specific needs. Recently, the focus of many projects has shifted from the analysis of newspaper text to that of non-standard varieties such...

Publication date: 2013